Compressing Provenance Graphs

Authors

  • Yulai Xie
  • Kiran-Kumar Muniswamy-Reddy
  • Darrell D. E. Long
  • Ahmed Amer
  • Dan Feng
  • Zhipeng Tan

Abstract

The provenance community has built a number of systems to collect provenance, most of which assume that provenance will be retained indefinitely. However, it is not cost-effective to retain provenance information inefficiently. Since provenance can be viewed as a graph, we note the similarities to web graphs and draw upon techniques from the web compression domain to provide our own novel and improved graph compression solutions for provenance graphs. Our preliminary results show that adapting web compression techniques results in a compression ratio of 2.12:1 to 2.71:1, which we can improve upon to reach ratios of up to 3.31:1.
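As a rough illustration of the kind of web-graph technique the abstract alludes to, the sketch below gap-encodes a node's sorted ancestor list and stores each gap as a variable-length integer. This is an assumption for illustration only: the abstract does not say which web compression techniques the authors adapted, and the function names (`varint`, `encode_adjacency`, `decode_adjacency`) and node ids are hypothetical.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_adjacency(neighbors) -> bytes:
    """Gap-encode node ids: a count, then each id minus the previous one."""
    ids = sorted(set(neighbors))
    out = bytearray(varint(len(ids)))
    prev = 0
    for node_id in ids:
        out += varint(node_id - prev)  # small gaps need fewer bytes
        prev = node_id
    return bytes(out)


def decode_adjacency(data: bytes) -> list:
    """Invert encode_adjacency and return the sorted node ids."""
    values = []
    value = shift = 0
    for b in data:
        value |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            values.append(value)
            value = shift = 0
    count, gaps = values[0], values[1:]
    ids, prev = [], 0
    for gap in gaps[:count]:
        prev += gap
        ids.append(prev)
    return ids


# Hypothetical ancestor ids of one provenance node.
ancestors = [1000, 1001, 1003, 1010]
encoded = encode_adjacency(ancestors)
assert decode_adjacency(encoded) == ancestors
```

For this toy list the encoded form takes 6 bytes, against 16 bytes for four fixed 32-bit ids. That figure only shows how such a ratio is computed; it is not a result from the paper.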

Similar resources

SGProv: Summarization Mechanism for Multiple Provenance Graphs

Scientific workflow management systems (SWfMS) are powerful tools in the automation of scientific experiments. Several workflow executions are necessary to accomplish one scientific experiment. Data provenance, typically collected by SWfMS during workflow execution, is important to understand, reproduce and analyze scientific experiments. Provenance is about data derivation, thus it is typicall...

Database Support for Exploring Scientific Workflow Provenance Graphs

Provenance graphs generated from real-world scientific workflows often contain large numbers of nodes and edges denoting various types of provenance information. A standard approach used by workflow systems is to visually present provenance information by displaying an entire (static) provenance graph. This approach makes it difficult for users to find relevant information and to explore and an...

Composition and Substitution in Provenance and Workflows

It is generally accepted that any comprehensive provenance model must allow one to describe provenance at various levels of granularity. For example, if we have a provenance graph of a process which has nodes to describe subprocesses, we need a method of expanding these nodes to obtain a more detailed provenance graph. To date, most of the work that has attempted to formalize this notion has ad...

Provenance Segmentation

Using pervasive provenance to secure mainstream systems has recently attracted interest from industry and government. Recording, storing and managing all of the provenance associated with a system is a considerable challenge. Analyzing the resulting noisy, heterogeneous, continuously-growing provenance graph adds to this challenge, and apparently necessitates segmentation, that is, approximatin...

Publication date: 2011